Abstract

We investigate structural complexity measures on digraphs, in par- 
ticular the cycle rank. This concept is intimately related to a classical 
topic in formal language theory, namely the star height of regular lan- 
guages. We explore this connection, and obtain several new algorithmic 
insights regarding both cycle rank and star height. Among other results, 
we show that computing the cycle rank is NP-complete, even for sparse 
digraphs of maximum outdegree 2. Notwithstanding, we provide both a 
polynomial-time approximation algorithm and an exponential-time exact 
algorithm for this problem. The former algorithm yields an
$O((\log n)^{3/2})$-approximation in polynomial time, whereas the latter yields
the optimum solution, and runs in time and space $O^*(1.9129^n)$ on digraphs of
maximum outdegree at most two.

Regarding the star height problem, we identify a subclass of the reg- 
ular languages for which we can precisely determine the computational 
complexity of the star height problem. Namely, the star height prob- 
lem for bideterministic languages is NP-complete, and this holds already 
for binary alphabets. Then we translate the algorithmic results concern- 
ing cycle rank to the bideterministic star height problem, thus giving a 
polynomial-time approximation as well as a reasonably fast exact expo- 
nential algorithm for bideterministic star height. 
1 Introduction



In the theory of undirected graphs, structural complexity measures for graphs, 
such as treewidth and pathwidth, have gained an important role, both from a 
structural and an algorithmic viewpoint, see e.g. [26]. However, networks
arising in some domains are more adequately modeled as having directed edges. 
Therefore in recent years, attempts have been made to lift such measures and 
parts of the theory of undirected graphs to the case of digraphs. Several recent 
works show that, while there often exist partial analogues to the undirected case, 
the picture for digraphs is much more involved |5j [6l [TH [27l [S^ . We discuss some 
of these measures, relate them to each other, and investigate their algorithmic 
aspects. Interestingly, we are able to show that all these complexity measures 
bound each other within a factor logarithmic in the order of the digraph, thus 
paralleling the case of undirected graphs [9]. We focus in particular on the
cycle rank, a digraph complexity measure originally motivated by studies in
formal languages [12]. Apparently, there is a renewed interest in this measure,
as witnessed by recent research efforts j2J HI [TU |22j 28 . 

We obtain the following results on computing the cycle rank: The deci- 
sion version of the problem is NP-complete, and this remains true for graphs 
of maximum outdegree at most 2. Previously, the problem was known to be 
NP-complete on undirected symmetric digraphs of unbounded degree, see [8]. 
On the positive side, we design a polynomial-time $O((\log n)^{3/2})$-approximation
algorithm, as well as an exact exponential algorithm computing the
cycle rank of digraphs. If the given digraph is of bounded outdegree, the latter
algorithm runs in time and space $O^*((2 - \varepsilon)^n)$, where n is the order of
the digraph, and $\varepsilon$ is a constant depending on the maximum outdegree. For
unbounded outdegree, the running time is still $O^*(2^n)$, whereas for maximum
outdegree 2, we even attain a bound of $O^*(1.9129^n)$. As a further application,
we also obtain an exact algorithm for the directed feedback vertex set problem 
on digraphs of maximum outdegree 2, which runs within the same time bound. 

Then we present applications of these findings to the theory of regular ex- 
pressions. The star height of a regular language is defined as the minimum 
nesting depth of stars needed in order to describe that language by a regular 
expression. Already in the 1960s, Eggan [12] raised the question whether the
star height can be determined algorithmically. It was not until 25 years later
that Hashiguchi found a rather complicated decidability proof [19]. Even today,
the best known algorithm has doubly exponential running time, and is
arguably still impractical [25]. Therefore, we study the complexity of the star
height problem when restricted to a subclass of the regular languages. We show 
that the star height problem for bideterministic languages is NP-complete, and 
this remains true when restricted to binary alphabets. Furthermore, we present 
both an efficient approximation algorithm and an exact exponential algorithm 
for this problem. The key to these results are the corresponding algorithms for 
the cycle rank of digraphs mentioned above; also the above mentioned bounds 
carry over to this application in formal language theory. 

The paper is organized as follows: After this introduction, we recall in
Section 2 some basic notions from graph theory and from automata theory. We
study structural properties of the cycle rank of digraphs in Section 3. Section 4
is devoted to algorithmic aspects of cycle rank. Afterward, we apply these
findings in Section 5 to the star height problem on bideterministic languages. We
complete the paper in Section 6 by pointing out possible directions for further
research.

2 Preliminaries 

2.1 Digraphs 

We assume familiarity with basic notions in graph theory, as contained in 

so we only fix the notation and a few specialties below. A digraph $G = (V, E)$
consists of a finite set of vertices V and a set of edges $E \subseteq V^2$.

We refer to an edge of the form $(v, v)$ as a loop. A digraph without loops is
called loop-free.

The outdegree of a vertex v is defined as the number of vertices u such
that $(v, u) \in E$. The total degree is defined as the number of distinct vertices u
having $(v, u) \in E$ or $(u, v) \in E$.

If the edge relation of a digraph G is symmetric, we say G is an (undirected) 
graph. By taking the symmetric closure of the edge relation of a digraph, we 
obtain its undirected counterpart — of course, this is a many-to-one correspon- 
dence. 

For a subset of vertices $U \subseteq V$, let $G[U]$ denote the sub(di)graph induced
by U, which is obtained by restricting the vertex set of G to U and redefining
the edge set E appropriately. In this context, we will often use $G - U$ as a
shorthand for $G[V \setminus U]$ and $G - v$ for $G[V \setminus \{v\}]$. A subset of vertices
$U \subseteq V$ is strongly connected if for every pair of vertices $u, v \in U$ there is a
(possibly empty) path from u to v. Maximal strongly connected subsets of V are
called strongly connected components; a strongly connected subset S is nontrivial
if the subdigraph $G[S]$ induced by S contains at least one edge (note that this
also allows the case $S = \{v\}$ if v has a loop). A digraph is acyclic if all of its
strongly connected components are trivial.

2.2 Formal Languages 

As with digraphs, we only recall some basic notions in formal language and 
automata theory — for a thorough treatment, the reader might want to consult 
a textbook such as [21]. In particular, let $\Sigma$ be a finite alphabet and $\Sigma^*$ the
set of all words over the alphabet $\Sigma$, including the empty word $\lambda$. The length
of a word w is denoted by $|w|$, where $|\lambda| = 0$. A (formal) language over the
alphabet $\Sigma$ is a subset of $\Sigma^*$.

The regular expressions over an alphabet $\Sigma$ are defined recursively in the
usual way:¹ $\emptyset$, $\lambda$, and every letter a with $a \in \Sigma$ is a regular expression; and

^For convenience, parentheses in regular expressions are sometimes omitted and the con- 
when $r_1$ and $r_2$ are regular expressions, then $(r_1 + r_2)$, $(r_1 \cdot r_2)$, and $(r_1)^*$
are also regular expressions. The language defined by a regular expression r,
denoted by $L(r)$, is defined as follows: $L(\emptyset) = \emptyset$, $L(\lambda) = \{\lambda\}$, $L(a) = \{a\}$,
$L(r_1 + r_2) = L(r_1) \cup L(r_2)$, $L(r_1 \cdot r_2) = L(r_1) \cdot L(r_2)$, and $L(r_1^*) = L(r_1)^*$. For
a regular expression r over $\Sigma$, the star height, denoted by $h(r)$, is a structural
complexity measure inductively defined by: $h(\emptyset) = h(\lambda) = h(a) = 0$,
$h(r_1 \cdot r_2) = h(r_1 + r_2) = \max(h(r_1), h(r_2))$, and $h(r_1^*) = 1 + h(r_1)$. The star height of a
regular language L, denoted by $h(L)$, is then defined as the minimum star height
among all regular expressions describing L.
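
The inductive definition of h(r) translates directly into a recursion over the syntax tree of an expression. The following is a minimal sketch, assuming a hypothetical encoding of regular expressions as nested tuples (the tags "empty", "lambda", "sym", "+", "." and "*" are chosen for illustration only and do not appear in the text). It computes the star height of a given expression, not the star height h(L) of the described language, which would require minimizing over all equivalent expressions.

```python
def star_height(r):
    """Star height h(r) of a regular expression encoded as a nested tuple.

    Assumed (hypothetical) encoding:
      ("empty",), ("lambda",), ("sym", a), ("+", r1, r2), (".", r1, r2), ("*", r1)
    """
    tag = r[0]
    if tag in ("empty", "lambda", "sym"):
        return 0                                  # h(0) = h(lambda) = h(a) = 0
    if tag in ("+", "."):
        return max(star_height(r[1]), star_height(r[2]))
    if tag == "*":
        return 1 + star_height(r[1])              # each star adds one level
    raise ValueError(f"unknown node type: {tag!r}")


a, b = ("sym", "a"), ("sym", "b")
flat = ("*", ("+", a, b))             # (a + b)*
nested = ("*", (".", ("*", a), b))    # (a* . b)*
print(star_height(flat), star_height(nested))
```

Both example expressions happen to describe languages of star height at most 1, which illustrates why h(L) is much harder to determine than h(r).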

It is well known that regular expressions are exactly as powerful as finite 
automata, i.e., for every regular expression one can construct an equivalent 
(deterministic) finite automaton and vice versa, see [21]. Finite automata are
defined as follows: A nondeterministic finite automaton (NFA) is a 5-tuple
$A = (Q, \Sigma, \delta, q_0, F)$, where Q is a finite set of states, $\Sigma$ is a finite set of input
symbols, $\delta : Q \times \Sigma \to 2^Q$ is the transition function, $q_0 \in Q$ is the initial state,
and $F \subseteq Q$ is the set of accepting states. The language accepted by the finite
automaton A is defined as $L(A) = \{ w \in \Sigma^* \mid \delta(q_0, w) \cap F \neq \emptyset \}$, where $\delta$
is naturally extended to a function $Q \times \Sigma^* \to 2^Q$. A nondeterministic finite
automaton $A = (Q, \Sigma, \delta, q_0, F)$ is deterministic, for short a DFA, if $|\delta(q, a)| \le 1$,
for every $q \in Q$ and $a \in \Sigma$. In this case we simply write $\delta(q, a) = p$ instead
of $\delta(q, a) = \{p\}$. Two (deterministic or nondeterministic) finite automata are
equivalent if they accept the same language. 

A deterministic finite automaton is bideterministic if it has a single final
state, and if the NFA obtained by reversing all transitions and exchanging the
roles of initial and final state is again deterministic. Notice that, by construction,
this NFA in any case accepts the reversed language. A regular language L
is bideterministic if there exists a bideterministic finite automaton accepting L.
These languages form a proper subclass of the regular languages.
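
Bideterminism of a given DFA is easy to test: besides a single final state, one only needs that no state is entered by two transitions carrying the same letter. The sketch below assumes a hypothetical representation in which the partial transition function is a dictionary from (state, symbol) pairs to states; the function name and encoding are illustrative, not taken from the text.

```python
def is_bideterministic(delta, finals):
    """Check whether a DFA is bideterministic.

    delta:  dict mapping (state, symbol) -> state (partial transition function)
    finals: collection of accepting states
    """
    if len(set(finals)) != 1:
        return False                  # exactly one final state is required
    entered = set()
    for (q, a), p in delta.items():
        if (p, a) in entered:         # two a-transitions enter p: the reversed
            return False              # automaton would be nondeterministic
        entered.add((p, a))
    return True


# DFA for a*b: reversing its transitions again gives a deterministic automaton.
star_then_b = {(0, "a"): 0, (0, "b"): 1}
# DFA for aa*: state 1 is entered by two a-transitions.
a_plus = {(0, "a"): 1, (1, "a"): 1}
print(is_bideterministic(star_then_b, {1}), is_bideterministic(a_plus, {1}))
```

Note that a negative answer only concerns the given automaton; whether the accepted language is bideterministic depends on whether some other automaton for it passes the test.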

3 Cycle Rank of Digraphs 

3.1 Cycle Rank and Directed Elimination Forests 

Originally suggested in the 1960s by Eggan and Büchi in the course of investigating
the star height of regular languages [12], the cycle rank is probably one
of the oldest structural complexity measures on digraphs. In this section, we 
delve into the structural foundations of cycle rank. 

Definition 1. The cycle rank of a directed graph $G = (V, E)$, denoted by $r(G)$,
is inductively defined as follows: If G is acyclic, then $r(G) = 0$. If G is strongly
connected and $E \neq \emptyset$, then $r(G) = 1 + \min_{v \in V} \{ r(G - v) \}$. If G is not strongly
connected, then $r(G)$ equals the maximum cycle rank among all strongly connected
components of G.
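
Definition 1 can be executed literally on small digraphs. The following is a naive sketch, assuming a digraph is given as a collection of vertices plus a collection of ordered pairs; the helper `sccs` is a simple reachability-based routine written for this illustration. It runs in exponential time and is meant only to make the recursion concrete.

```python
def sccs(vertices, edges):
    """Strongly connected components of the subdigraph induced by `vertices`."""
    vs = set(vertices)
    succ = {v: [w for (u, w) in edges if u == v and w in vs] for v in vs}

    def reach(start):
        seen, stack = {start}, [start]
        while stack:
            for w in succ[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    reachable = {v: reach(v) for v in vs}
    comps, assigned = [], set()
    for v in vs:
        if v not in assigned:
            comp = frozenset(u for u in reachable[v] if v in reachable[u])
            comps.append(comp)
            assigned |= comp
    return comps


def cycle_rank(vertices, edges):
    """Cycle rank of the induced subdigraph, following Definition 1 literally."""
    best = 0
    for comp in sccs(vertices, edges):
        if not any(u in comp and w in comp for (u, w) in edges):
            continue                  # trivial component: contributes rank 0
        # strongly connected with at least one edge: 1 + min over removed vertex
        best = max(best, 1 + min(cycle_rank(comp - {v}, edges) for v in comp))
    return best


cycle3 = {("a", "b"), ("b", "c"), ("c", "a")}
complete3 = {(u, w) for u in "abc" for w in "abc" if u != w}
print(cycle_rank("abc", cycle3), cycle_rank("abc", complete3))
```

On the directed 3-cycle, removing any vertex leaves an acyclic digraph, whereas the complete digraph on three vertices needs two rounds of removal.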

catenation is simply written as juxtaposition. The priority of operators is specified in the 
usual fashion; concatenation is performed before union, and star before both product and 
union. 
Figure 1: An example digraph and a directed elimination tree for it.



We note that the requirement $E \neq \emptyset$ in the above definition allows us to
differentiate between acyclic digraphs and (otherwise acyclic) digraphs with loops.
We also remark that the cycle rank can be equivalently defined using
decompositions, compare [30]:

Definition 2. A directed elimination tree for a nontrivially strongly connected
digraph $G = (V, E)$ is a rooted tree $T = (\mathcal{T}, \mathcal{E})$ having the following properties:

a) $\mathcal{T} \subseteq V \times 2^V$, and if $(x, X) \in \mathcal{T}$, then $x \in X$.

b) The root of the tree is $(v, V)$ for some $v \in V$.

c) There is no pair of distinct vertices of the form $(x, X)$ and $(y, X)$ in the forest.

d) If $(x, X)$ is a node in T, and $G[X] - x$ has $j \ge 0$ nontrivial strongly connected
components $Y_1, \ldots, Y_j$, then $(x, X)$ has exactly j children of the form
$(y_1, Y_1), \ldots, (y_j, Y_j)$ for some $y_1, \ldots, y_j \in V$.

A directed elimination forest for a digraph G with $k \ge 0$ nontrivial strongly connected
components $C_1, \ldots, C_k$ is a rooted forest consisting of directed elimination
trees for $G[C_i]$, $1 \le i \le k$.

Figure 1 illustrates this concept by an example. It is shown in [30] that the
minimum height among all directed elimination forests for G equals the cycle
rank of G. Interestingly, the concept of elimination forests was rediscovered in
the context of sparse matrix factorization, in [35] for the undirected case and
in [13] for the directed case.

3.2 Cycle Rank and other Digraph Complexity Measures 

We compare the cycle rank with two other structural complexity measures, 
namely weak separator number and directed pathwidth. The first measure is a 
generalization of separator number (see e.g. [9, 33, 17]) to digraphs:
Definition 3. Let $G = (V, E)$ be a digraph and let $U \subseteq V$ be a set of vertices. A
set of vertices S is a weak balanced separator for U if every strongly connected
component of $G[U \setminus S]$ contains at most $\lceil \frac{1}{2} |U \setminus S| \rceil$ vertices. The weak separator
number of G, denoted by $s(G)$, is defined as the maximum size, taken over all
subsets $U \subseteq V$, among the minimum weak balanced separators for U.

Some readers will feel that the above definition is a bit contrived because 
of the ceiling operator $\lceil \cdot \rceil$. But this is an essential detail, as it guarantees that
a digraph with a weak balanced separator of size k will always admit a weak 
balanced separator of size k + 1. 
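
The separator condition of Definition 3 can be checked mechanically. The sketch below assumes the bound reads "at most half of the remaining vertices, rounded up" (consistent with the ceiling operator discussed above) and uses a hypothetical edge-list representation; the function names are illustrative.

```python
def sccs(vertices, edges):
    """Strongly connected components of the subdigraph induced by `vertices`."""
    vs = set(vertices)
    succ = {v: [w for (u, w) in edges if u == v and w in vs] for v in vs}

    def reach(start):
        seen, stack = {start}, [start]
        while stack:
            for w in succ[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    reachable = {v: reach(v) for v in vs}
    comps, assigned = [], set()
    for v in vs:
        if v not in assigned:
            comp = frozenset(u for u in reachable[v] if v in reachable[u])
            comps.append(comp)
            assigned |= comp
    return comps


def is_weak_balanced_separator(edges, U, S):
    """True if every strongly connected component of G[U \\ S] is small enough."""
    rest = set(U) - set(S)
    limit = (len(rest) + 1) // 2      # ceil(|U \ S| / 2), assumed bound
    return all(len(comp) <= limit for comp in sccs(rest, edges))


cycle4 = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
U = {"a", "b", "c", "d"}
print(is_weak_balanced_separator(cycle4, U, set()))    # one component of size 4
print(is_weak_balanced_separator(cycle4, U, {"a"}))    # remaining path is acyclic
```

Computing the weak separator number itself additionally requires minimizing over S and maximizing over U, which this sketch does not attempt.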

In order to relate weak separator number and cycle rank, we need the following
recurrence: For integers $k, n \ge 1$, let $R_k(n)$ be given by the recurrence

$$R_k(n) = k + R_k\left( \left\lceil \frac{n-k}{2} \right\rceil \right)$$

with $R_k(r) = r$ for $r \le k$.
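
For concreteness, the recurrence can be evaluated directly. The sketch below assumes it reads R_k(n) = k + R_k(⌈(n − k)/2⌉) with R_k(r) = r for r ≤ k, which is the form suggested by the proof of Lemma 4 below; the function name `R` is chosen for illustration.

```python
def R(k, n):
    """Evaluate the assumed recurrence R_k(n) = k + R_k(ceil((n - k) / 2))."""
    if n <= k:
        return n                          # base case: R_k(r) = r for r <= k
    return k + R(k, (n - k + 1) // 2)     # (n - k + 1) // 2 == ceil((n - k) / 2)


# Worked example: R_2(10) = 2 + R_2(4) = 2 + 2 + R_2(1) = 5.
print(R(2, 10))
```

Each unfolding pays k for one separator and roughly halves the remaining order, which is where the logarithmic dependence on n comes from.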

Lemma 4. Let G be a loop-free digraph with n vertices and weak separator
number at most k. Then $r(G) \le R_k(n) - 1$.

Proof. We generalize a proof given in [17] to the case of digraphs.

Let $G^\circ$ be the digraph obtained from G by adding self-loops to each vertex.
Then $r(G^\circ) = r(G) + 1$, so we may prove instead that $r(G^\circ) \le R_k(n)$.

We prove the statement by induction on the order n of $G^\circ$. The base cases
$n \le k$ of the induction are easily seen to hold, since the cycle rank of a digraph
is always bounded above by its order.

For the induction step, assume $n > k$. As already mentioned, if $G^\circ$ admits
a weak balanced separator of size at most k, then it also has a weak balanced
separator of size exactly k. Let X be such a separator.

Denote the strongly connected components of $G^\circ - X$ by $C_1, \ldots, C_p$. Then
$r(G^\circ) \le k + r(G^\circ - X)$, and by definition of cycle rank,

$$r(G^\circ - X) \le \max_{1 \le i \le p} r(G^\circ[C_i]).$$

As X is a weak balanced separator, we have $|C_i| \le \lceil (n-k)/2 \rceil$ for $1 \le i \le p$, so we
can apply the induction hypothesis to obtain

$$\max_{1 \le i \le p} r(G^\circ[C_i]) \le R_k(\lceil (n-k)/2 \rceil).$$

Putting these pieces together, we have $r(G^\circ) \le k + R_k(\lceil (n-k)/2 \rceil)$, as desired. □

The recurrence $R_k(n)$ is studied in [17], where also the inequality
$R_k(n) \le k \cdot \log(n/k)$ is derived.² We thus have the following bound:

Corollary 5. Let G be a loop-free digraph with n vertices and weak separator
number at most k. Then $r(G) \le k \cdot \log(n/k) - 1$. □



²Here log denotes the binary logarithm.
This inequality is sharp already in the undirected case, see [17]. Previously, a
looser bound comparing cycle rank to a similar notion of weak separator number
was given in [18]. It is easy to see that Corollary 5 improves upon the previous
bound.

We turn to the comparison with directed pathwidth. That measure was 
introduced by Reed, Seymour and Thomas (cf. [5]) as a generalization of
pathwidth to digraphs.

Definition 6. For a digraph $G = (V, E)$, a directed path decomposition of G
is a sequence $W_1, W_2, \ldots, W_r$ of subsets of V, called bags, such that

a) each vertex is contained in at least one bag,

b) for all $i < j < k$ holds $W_i \cap W_k \subseteq W_j$, and

c) for each edge $(u, v)$ in E, there is a bag containing both endpoints, or there
exist i, j with $i < j$ such that the tail u is in $W_i$ and the head v is in $W_j$.

The width of a directed path decomposition is defined as the maximum cardinal- 
ity among all bags minus 1. The directed pathwidth is defined as the minimum 
width among all directed path decompositions for G. 
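
The three conditions of Definition 6 can be verified directly for a candidate sequence of bags. The following sketch assumes bags are given as a list of sets and edges as (tail, head) pairs; the function names are illustrative. It checks a given decomposition and does not search for one of minimum width.

```python
def is_directed_path_decomposition(vertices, edges, bags):
    """Check conditions a), b), c) of a directed path decomposition."""
    bags = [set(b) for b in bags]
    r = len(bags)
    # a) every vertex lies in at least one bag
    if any(all(v not in b for b in bags) for v in vertices):
        return False
    # b) W_i ∩ W_k ⊆ W_j for all i < j < k
    for i in range(r):
        for k in range(i + 2, r):
            common = bags[i] & bags[k]
            if any(not common <= bags[j] for j in range(i + 1, k)):
                return False
    # c) each edge (u, v) lies inside a bag, or points forward in the sequence
    for (u, v) in edges:
        together = any(u in b and v in b for b in bags)
        forward = any(u in bags[i] and v in bags[j]
                      for i in range(r) for j in range(i + 1, r))
        if not (together or forward):
            return False
    return True


def width(bags):
    """Width of a decomposition: maximum bag size minus one."""
    return max(len(b) for b in bags) - 1


cycle3 = [("a", "b"), ("b", "c"), ("c", "a")]
good = [{"a", "b"}, {"a", "c"}]
bad = [{"a"}, {"b"}, {"c"}]          # edge (c, a) points backward
print(is_directed_path_decomposition("abc", cycle3, good), width(good))
print(is_directed_path_decomposition("abc", cycle3, bad))
```

The directed 3-cycle thus has a decomposition of width 1, while no ordering of singleton bags can accommodate all three edges of a cycle.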

A directed path decomposition is normal, if adjacent bags may differ in at 
most one vertex, and it is easy to transform a directed path decomposition into 
a normal one. Based on normal path decompositions, it is not difficult to derive 
the following result: 

Lemma 7. Let G be a digraph. Then $s(G) \le \mathrm{dpw}(G)$. □

How does cycle rank relate to directed pathwidth? We can answer this using 
directed elimination forests. 

Lemma 8. Let G be a digraph. Then $\mathrm{dpw}(G) \le r(G)$.

Proof. We prove by induction that each directed elimination forest of height k
for G can be transformed into a directed path decomposition for G of width at
most k.

If $k = 0$, then G is acyclic, and thus clearly admits a directed path
decomposition of width 0.

For the induction step, assume the directed elimination forest for G has roots
$(x_1, C_1), (x_2, C_2), \ldots, (x_r, C_r)$, with the strongly connected components $C_i$ in
topological order. Let $G_i = G[C_i] - x_i$. Then $G_i$ has cycle rank at most $k - 1$. By
induction assumption, each digraph $G_i$ admits a directed path decomposition
of width at most $k - 1$. By adding the vertex $x_i$ to each bag in the respective
decomposition for $G_i$, we obtain a directed path decomposition for $G[C_i]$.
Concatenating the r individual directed path decompositions while respecting the
above topological order, we obtain a directed path decomposition of width at
most k for G, as desired. □

Altogether, we have derived the following chain of inequalities: 
Theorem 9. Let G be a loop-free digraph with n vertices and weak separator
number k. Then

$$k \le \mathrm{dpw}(G) \le r(G) \le k \cdot \log(n/k) - 1.$$ □

Quite a few more structural complexity measures on digraphs were studied 
recently, such as directed tree-width, DAG-width, and Kelly-width. As detailed 
in [23], each of these measures is bounded below by a function that is linear in
the weak separator number.³ On the other hand, all of those are bounded above
by the directed pathwidth (cf. [24]), so Theorem 9 will also serve for comparing
them with cycle rank, and with weak separator number.

4 Computational Aspects of Cycle Rank 

4.1 Computational Complexity 

We turn to algorithmic questions. First, we classify the computational
complexity of the decision problem CYCLE RANK: Given a digraph G and an
integer k, determine whether the cycle rank of G is at most k.

Theorem 10. The CYCLE RANK problem is NP-complete, and this still
holds when requiring that the input digraph is strongly connected.

Proof. Membership in NP can be seen by the equivalent definition using
directed elimination forests: Let $G = (V, E)$ denote the given digraph. Every
elimination forest for G contains at most $|V|$ tree vertices, and each tree vertex
is of size at most $|V|$. A nondeterministic polynomial-time bounded Turing
machine can guess such a witness, and then verify that it indeed constitutes an
elimination forest of height at most k.

For NP-hardness, we use a corresponding result known for the undirected
case. Given a symmetric loop-free digraph G, it is easy to see (e.g. by [31,
Lem. 2.2]) that an undirected elimination forest of height $k + 1$ in the sense
of [HI 131] corresponds to a directed elimination forest of height k in our sense
(the term +1 accounts for the slightly different definition of height used in [31]).
However, determining the minimum height among all undirected elimination
forests is NP-complete, also for (strongly) connected undirected graphs [5]. □

Using tools from formal language theory, we will prove later that NP- 
hardness still holds for digraphs of maximum outdegree at most 2 and of maxi- 
mum total degree at most 4. 

4.2 Approximate Computation 

How to cope with this negative result? One possibility is to look for an ap- 
proximate solution. Indeed, it is known that for undirected graphs, the cycle 

³The notion used in [24] corresponds to our notion of weak separator number up to a
constant factor.
rank problem admits an input-dependent polynomial-time approximation
algorithm [3]. In the following, we devise a more general approximation algorithm,
which covers also the case of unsymmetric digraphs. The basic pattern of our 
algorithm for directed cycle rank is again divide-and-conquer along separators. 

Theorem 11. The CYCLE RANK problem admits a polynomial-time
approximation within a factor of $O((\log n)^{3/2})$.

Proof. The following recursive procedure computes a directed elimination forest
for the induced subgraph $G[W]$, where $W \subseteq V$ is passed as parameter to the
procedure.

If $G[W]$ consists of several strongly connected components, apply the
procedure recursively to each of these; the union of these results gives a directed
elimination forest for $G[W]$.

Otherwise, use the polynomial-time algorithm from [24, Corollary 2.25] to
find a small vertex subset $S \subseteq W$ in $G[W]$ with the property that every strongly
connected component of $G[W] - S$ contains at most a fixed constant fraction of
the $|W|$ vertices. Then pass the digraph $G[W] - S$ as parameter to the recursive
procedure. Upon returning, the directed elimination forest F returned for
$G[W] - S$ is then extended, one by one, for each vertex s from S.

More precisely, put the elements of S in arbitrary order. Then for given s
in S, let X denote the set of vertices occurring before s. Assuming we have
already computed a directed elimination forest for $G[W \cup X]$, we now show how
to extend this to a forest for $G[W \cup X \cup \{s\}]$. Initially, the set X is empty, and we
proceed for each s until $X = S$. Let $C_1, \ldots, C_p$ denote those strongly connected
components of the digraph $G[W \cup X]$ for which $G[W \cup \{s\} \cup \bigcup_i C_i]$ is strongly
connected, and let $D_1, \ldots, D_r$ denote the remaining strongly connected
components in $G[W \cup X]$. The elimination forest for $G[W \cup X]$ contains an elimination
tree for each $G[C_i]$, and for each $G[D_i]$. Make up a new root $(s, X \cup \{s\})$, and
attach the directed elimination trees for the digraphs $G[C_i]$ as children to that
new root. This gives a directed elimination tree for $G[W \cup \{s\} \cup \bigcup_i C_i]$. The
union of this tree with the directed elimination trees for the strongly connected
components $D_1, \ldots, D_r$ yields a directed elimination forest for $G[W \cup X \cup \{s\}]$.

This completes the description of the subroutine for extending the forest. 

The recursion terminates as soon as the size of W decreases below $\beta (\log n)^{3/2}$.
In this case, simply return an (arbitrary) directed elimination forest for $G[W]$.

Here, the number $\beta$ is a fixed, suitably chosen, constant coming from the
analysis below. This completes the description of the algorithm. 

It remains to analyze the above algorithm. It is readily checked that the
algorithm returns an elimination forest for G. For the performance guarantee,
those recursive calls that simply partition the graph into strongly connected
components do not add to the height of the resulting forest; if we restrict our
attention to those recursive calls that compute a suitable vertex subset S, the
depth of the recursion tree is $O(\log n)$. At each such step, we can find in
polynomial time a suitable set S of size at most $\beta k \sqrt{\log n}$, where k is the directed
pathwidth of G, and $\beta$ is some known constant (cf. [24, Corollary 2.25]). The
recursion terminates with an elimination forest of height at most $\beta \cdot (\log n)^{3/2}$.
Thus the overall height is bounded by

$$\beta \cdot k \cdot \sqrt{\log n} \cdot O(\log n) + \beta \cdot (\log n)^{3/2} = O(k \cdot (\log n)^{3/2}),$$

where k is the directed pathwidth of G. By Lemma 8 we have $k \le r(G)$. In this
way, we have a polynomial-time $O((\log n)^{3/2})$-approximation for cycle rank. □

The above performance guarantee matches the best previous result known 
for the undirected case [1]. For other digraph complexity measures, such as D- 
width and directed pathwidth, approximation algorithms in a similar vein were 
recently given in [24].



4.3 Exact Computation 

In certain circumstances, an approximation guarantee within a factor $O((\log n)^{3/2})$
may not suffice. Thus we also take a look at exact algorithms for computing 
the cycle rank. 

The naive algorithm for determining cycle rank according to Definition 1
requires inspecting n! possibilities on a graph with n vertices, as witnessed by
the complete graph. While one may not expect a polynomial-time algorithm,
we can still do much better: 

Theorem 12. The cycle rank of an n-vertex digraph can be computed in time 
and space $O^*(2^n)$.

Proof. We show how the characterization of the cycle rank of a digraph
$G = (V, E)$ in terms of the directed elimination forests from Definition 2 can be
turned into a dynamic programming scheme. We only consider the case that G
itself is nontrivially strongly connected; otherwise, we obtain the cycle rank by
taking the maximum among the cycle ranks of the nontrivial strongly connected
components of G. For a nontrivial strongly connected subset of vertices $X \subseteq V$
and a vertex $x \in X$, let $t(x, X)$ denote the minimum height among all elimination
trees for $G[X]$ with root $(x, X)$. Then $r(G) = \min_{v \in V} t(v, V)$, so it suffices
to design an algorithm computing $t(v, V)$ for each $v \in V$. By inspecting
Definition 2 we obtain the recurrence

$$t(x, X) = \begin{cases} 1 & \text{if } G[X] - x \text{ is acyclic,} \\ 1 + \max_{Y} \min_{y \in Y} t(y, Y) & \text{otherwise.} \end{cases}$$

Here Y runs over all nontrivial strongly connected components of $G[X] - x$
(of which there can be at most $|X| - 1$). Using the classic trick of memoization
(see [26]), this recurrence can be easily transformed into a dynamic programming
scheme that runs in time $|\mathcal{S}| \cdot n^{O(1)}$, where $\mathcal{S} \subseteq 2^V$ is the set
of strongly connected subsets of the digraph G. □
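
The memoized scheme from the proof can be sketched as follows. This is an illustrative implementation under stated assumptions, not a tuned one: the table is indexed by strongly connected vertex subsets (as frozensets), the recurrence is taken to be t(x, X) = 1 + max over the nontrivial components Y of G[X] − x of min over y in Y of t(y, Y) (with an empty maximum counting as 0), and the helper names are chosen for this example.

```python
from functools import lru_cache


def cycle_rank_dp(vertices, edges):
    """Cycle rank via memoization over strongly connected vertex subsets."""
    edges = frozenset(edges)

    def nontrivial_sccs(X):
        succ = {v: [w for (u, w) in edges if u == v and w in X] for v in X}

        def reach(start):
            seen, stack = {start}, [start]
            while stack:
                for w in succ[stack.pop()]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            return seen

        reachable = {v: reach(v) for v in X}
        comps, assigned = [], set()
        for v in X:
            if v in assigned:
                continue
            comp = frozenset(u for u in reachable[v] if v in reachable[u])
            assigned |= comp
            if len(comp) > 1 or (v, v) in edges:   # keep nontrivial ones only
                comps.append(comp)
        return comps

    @lru_cache(maxsize=None)
    def best(X):
        # min over roots x in X of t(x, X); one table entry per subset X
        return min(
            1 + max((best(Y) for Y in nontrivial_sccs(X - {x})), default=0)
            for x in X
        )

    return max((best(C) for C in nontrivial_sccs(frozenset(vertices))), default=0)


cycle3 = {("a", "b"), ("b", "c"), ("c", "a")}
complete3 = {(u, w) for u in "abc" for w in "abc" if u != w}
print(cycle_rank_dp("abc", cycle3), cycle_rank_dp("abc", complete3))
```

Because each subset X is solved once, the number of table entries is bounded by the number of strongly connected subsets, which is the quantity the subsequent discussion sets out to bound.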

The reader is invited to try out the above algorithm for the digraph depicted
in Figure 1. The bottleneck in the above algorithm is the requirement of computing
and storing the cycle rank for all elements of $\mathcal{S}$, namely of the family
of strongly connected subsets in the input digraph. For a complete digraph,
we have $|\mathcal{S}| = 2^n$, but this bound can no longer be reached for digraphs of
bounded maximum outdegree. For undirected graphs of maximum degree d, a 
nontrivial bound on the number of (weakly) connected subsets was established 
recently in jj- As it turns out, their bound allows the following generalization 
to the theory of digraphs, in that the original proof carries over with obvious 
modifications: 

Lemma 13. Let G be a digraph of order n with maximum outdegree at most d.
Then the number of strongly connected subsets of V is at most $\gamma^n + n$, with
$\gamma = (2^{d+1} - 1)^{1/(d+1)}$. In particular, for $d = 2$, we have $\gamma \approx 1.9129$. □
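
The base of the exponential bound in Lemma 13 is easy to tabulate. The short sketch below evaluates (2^(d+1) − 1)^(1/(d+1)) in floating point; the function name `gamma` is chosen for illustration.

```python
def gamma(d):
    """Base of the bound in Lemma 13 for maximum outdegree d."""
    return (2 ** (d + 1) - 1) ** (1.0 / (d + 1))


for d in range(1, 6):
    print(d, round(gamma(d), 4))
```

For d = 2 this is the cube root of 7, roughly 1.9129; the value stays strictly below 2 for every fixed d and approaches 2 as d grows, matching the statement that the improvement over 2^n depends on the outdegree bound.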

On digraphs of bounded outdegree, we thus obtain the following improved 
bound on the running time of the above algorithm: 

Theorem 14. Let G be a digraph of order n with constant maximum outdegree d.
Then the cycle rank of G can be computed in time and space $O^*((2 - \varepsilon)^n)$,
where $\varepsilon$ is a constant depending on d. In particular, for digraphs of maximum
outdegree 2, the cycle rank can be computed in time and space $O^*(1.9129^n)$. □

It seems that Lemma 13 has a host of algorithmic consequences. For illustration,
recall that a vertex subset $S \subseteq V$ of a digraph G is a directed feedback
vertex set, if removing S from G leaves an acyclic digraph. Off the cuff, we can
devise an exact algorithm for minimum directed feedback vertex set on sparse 
digraphs. 

Theorem 15. Let G be a digraph of order n with constant maximum outdegree d.
Then a minimum directed feedback vertex set of G can be computed in
time and space $O^*((2 - \varepsilon)^n)$, where $\varepsilon$ is a constant depending on d. In
particular, for digraphs of maximum outdegree 2, a minimum directed feedback vertex
set can be computed in time and space $O^*(1.9129^n)$. □

Proof. By duality, the task of enumerating all minimal directed feedback vertex 
sets is equivalent to enumerating all maximal acyclic subsets, that is, maxi- 
mal vertex subsets that induce a directed acyclic graph. Here, "minimal" and 
"maximal" are meant with respect to set inclusion. 

Since there is an algorithm enumerating all minimal directed feedback vertex
sets (or, equivalently, all maximal acyclic subsets) with polynomial delay [36],
it only remains to derive a combinatorial bound on the number of such sets. A
strongly connected subset $S \subseteq V$ in G is called a minimal strongly connected
subset, if S contains a vertex v such that $S - v$ is an acyclic subset. Clearly,
in this case, $S - v$ is a maximal acyclic subset. Thus, each minimal strongly
connected subset S will give rise to at most $|S| \le n$ maximal acyclic subsets;
and each maximal acyclic subset can be obtained in this way from a minimal
strongly connected subset. Thus the total number of maximal acyclic subsets
in G is at most n times the number of (minimal) strongly connected subsets
in G. The result now follows with Lemma 13. □
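
The duality used in the proof (S is a directed feedback vertex set exactly when V \ S induces an acyclic digraph) can be illustrated by a brute-force search. The sketch below is deliberately not the enumeration algorithm referred to in the proof: it simply tries candidate sets by increasing size, with acyclicity tested by repeatedly removing vertices of indegree zero. Function names and the edge-set representation are assumptions made for this example.

```python
from itertools import combinations


def is_acyclic(vertices, edges):
    """Test whether the subdigraph induced by `vertices` is acyclic."""
    vs = set(vertices)
    indeg = {v: 0 for v in vs}
    succ = {v: [] for v in vs}
    for (u, w) in edges:
        if u in vs and w in vs:
            succ[u].append(w)
            indeg[w] += 1
    queue = [v for v in vs if indeg[v] == 0]
    removed = 0
    while queue:                      # peel off vertices of indegree zero
        v = queue.pop()
        removed += 1
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return removed == len(vs)


def min_feedback_vertex_set(vertices, edges):
    """Smallest S such that removing S leaves an acyclic digraph (brute force)."""
    vs = list(vertices)
    for size in range(len(vs) + 1):
        for S in combinations(vs, size):
            if is_acyclic(set(vs) - set(S), edges):
                return set(S)


cycle3 = {("a", "b"), ("b", "c"), ("c", "a")}
two_cycles = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")}
print(len(min_feedback_vertex_set("abc", cycle3)),
      len(min_feedback_vertex_set("abcd", two_cycles)))
```

This search may inspect all 2^n subsets; the point of Theorem 15 is that on digraphs of bounded outdegree one can restrict attention to far fewer candidates.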
The above running time looks reasonable if we consider the following facts:
First, even on digraphs of maximum outdegree at most 2, the problem is
NP-complete [15, Problem GT7]. Second, the fastest known exact algorithm for
digraphs of unbounded outdegree [32] runs in time $O^*(1.9977^n)$. Third, easy
examples show that digraphs with outdegree 2 can have at least $1.4142^n$ minimal
directed feedback vertex sets.

5 Star Height of Regular Expressions 

As it turns out, the cycle rank of digraphs is intimately related to structural 
and descriptional complexity aspects of regular expressions. The star height of 
a regular language L, denoted by h(L), is defined as the minimum nesting depth 
of stars in any regular expression describing L. The following relation between 
star height and the cycle rank of nondeterministic finite automata (NFAs) was 
shown already in the seminal paper on star height. 

Theorem 16 (Eggan's Theorem). Let L be a regular language. Then

$$h(L) = \min \{\, r(A) \mid A \text{ an NFA accepting } L \,\}.$$

Here, $r(A)$ denotes the cycle rank of the digraph underlying the transition
structure of A.

As an aside, Eggan's Theorem was recently used to obtain a powerful lower 
bound technique for the minimum required length of regular expressions for a 
given regular language: 

Lemma 17 (Star Height Lemma, [18]). Let L be a regular language. If L admits
a regular expression of length n, then n > 2^^^'^^'>\ 

The gist of the proof is that each regular expression can be converted into 
an equivalent NFA of comparable size, but whose transition structure is only 
poorly connected. The result then follows using Eggan's Theorem. In [18], this
method was used to prove the unexpected result that complementing regular 
languages can cause a doubly-exponential blow-up in the minimum required 
regular expression length. 

Of course, the minimum in Eggan's Theorem is taken over infinitely many 
NFAs, and indeed for more than two decades, it was unknown whether there ex- 
ists an algorithm deciding the STAR HEIGHT problem: given a deterministic 
finite automaton (DFA) A and an integer k, determine whether the star height of 
$L(A)$ is at most k, a question raised in [12]. Although the problem is now known
to be decidable, the best known upper bound⁴ to date is EXPSPACE [25]. To
the best of our knowledge, nontrivial lower bounds are known only for the case 

⁴The noted upper bound holds more generally for a given NFA if also an NFA accepting
the complement language is provided as part of the input. Recall that complementing a DFA
does not affect its size, whereas complementing an NFA can cause an exponential blow-up in
the required number of states [20].
where the input is specified succinctly, as an NFA: Determining the star height
of a language specified as an NFA is PSPACE-hard. Yet, as illustrated
in [23], a large multitude of natural questions about the language accepted by
a given NFA is PSPACE-hard, whereas the corresponding questions often become
computationally easy if a DFA is given. Therefore, such a hardness result
renders more service to understanding the effect of succinct input descriptions 
than to understanding the computational nature of the core problem at hand. 
That is why we deliberately stick to the convention to specify the input as a 
DFA. 

Here we settle the complexity of the star height problem for a subclass of the
regular languages, namely the bideterministic languages. The decision problem
BIDETERMINISTIC STAR HEIGHT is defined as follows: Given a bideterministic
finite automaton A and an integer k, decide whether the star height
of L(A) is at most k.
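As a small illustration of the input class, the following Python sketch tests bideterminism. It assumes the usual definition (a DFA with a single accepting state whose reversal is again deterministic); the function name `is_bideterministic` and the dictionary encoding of the transition function are hypothetical choices made for this example only.

```python
def is_bideterministic(transitions, finals):
    """Check bideterminism of a (partial) DFA.

    transitions: dict mapping (state, symbol) -> state. Using a dict
    already guarantees determinism in the forward direction.
    finals: set of accepting states.
    Assumed definition: exactly one accepting state, and no state has two
    incoming transitions carrying the same symbol.
    """
    if len(finals) != 1:
        return False
    seen = set()  # pairs (target state, symbol) already encountered
    for (src, sym), dst in transitions.items():
        if (dst, sym) in seen:
            return False  # the reversed automaton would be nondeterministic
        seen.add((dst, sym))
    return True
```

For instance, the two-state automaton with transitions 0 -a-> 1 and 1 -b-> 0 and accepting state 0 passes the test, whereas any automaton in which two different states enter the same state on the same symbol does not.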

Bideterministic finite automata have the special property that the star height 
problem of bideterministic languages boils down to determining the cycle rank 
of a digraph. The following theorem is proved in [29]:

Theorem 18 (McNaughton's Theorem). Let L be a bideterministic language, 
and let A be the minimal trim (i.e., without a dead state) DFA accepting L. 
Then h(L) = r(A).
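The cycle rank on the right-hand side can be evaluated directly from its standard recursive definition: it is 0 for acyclic digraphs, 1 plus the minimum over all single-vertex deletions for strongly connected digraphs containing a cycle, and the maximum over the strongly connected components otherwise. The following Python sketch is a plain memoized recursion over vertex subsets, so it solves at most 2^n subproblems; it is an illustration only and not the algorithm behind Theorem 14. The name `cycle_rank` and the edge-list input format are assumptions of this example.

```python
from functools import lru_cache

def cycle_rank(vertices, edges):
    """Cycle rank of the digraph (vertices, edges), by memoized recursion."""
    succ = {v: set() for v in vertices}
    for x, y in edges:
        succ[x].add(y)

    def reach(v, S):
        # Vertices reachable from v inside the induced subgraph on S.
        seen, stack = {v}, [v]
        while stack:
            u = stack.pop()
            for w in succ[u]:
                if w in S and w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    @lru_cache(maxsize=None)
    def rank(S):
        if not S:
            return 0
        r = {v: reach(v, S) for v in S}
        best, remaining = 0, set(S)
        while remaining:
            v = next(iter(remaining))
            # Strongly connected component of v within S.
            comp = frozenset(u for u in r[v] if v in r[u])
            remaining -= comp
            if len(comp) == 1 and v not in succ[v]:
                continue  # trivial component without a loop: rank 0
            if comp == S:
                # Strongly connected: delete the best single vertex.
                return 1 + min(rank(S - {u}) for u in S)
            best = max(best, rank(comp))
        return best

    return rank(frozenset(vertices))
```

On a directed 3-cycle the function returns 1, and on the complete digraph on 3 vertices (all arcs in both directions) it returns 2. Applied to the transition structure of the minimal trim DFA of a bideterministic language, McNaughton's Theorem says the result equals the star height.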

On the positive side, the algorithmic results from the previous section easily
translate to a formal language setup using McNaughton's Theorem. For approximating
STAR HEIGHT, we have to resort to Eggan's Theorem, giving
only an O(n)-approximation. In the bideterministic case, we have the following
counterpart to Theorem 11:

Theorem 19. The BIDETERMINISTIC STAR HEIGHT problem admits
a polynomial-time approximation within a factor of O((log n)^{3/2}). □

We also have a natural counterpart to Theorem 14:

Theorem 20. Let A be a bideterministic finite automaton with n states over an
input alphabet of size k. Then the star height of L(A) can be computed exactly,
in time and space O*((2 - ε)^n), where ε is a constant depending on k. In
particular, for the case of binary input alphabets, the star height can be computed
in time and space O*(1.9129^n). □

On the negative side, the NP-hardness result for CYCLE RANK also
translates to its language-theoretic counterpart. Moreover, we show that the
problem remains hard already for binary input alphabets:

Theorem 21. The BIDETERMINISTIC STAR HEIGHT problem is
NP-complete, and this still holds when restricted to bideterministic automata
over binary input alphabets.

Proof. We first show NP-completeness for the case of unbounded alphabet size, 
and then provide a polynomial-time reduction to the case of binary alphabets. 
For membership in NP, we use McNaughton's Theorem (Theorem 18) to
reduce the problem to CYCLE RANK, and the latter is in NP by Theorem 10.

To establish NP-hardness, we reduce from the problem of determining for a
strongly connected digraph G = (V, E) and an integer k whether the cycle rank
is at most k, which is NP-hard by Theorem 10. For a vertex v in V, define

L(G, v) = { w ∈ E* | w is a walk in G starting and ending in v }.

A deterministic finite automaton A accepting L(G, v) has V as set of states
and for each edge (x, y) ∈ E a transition labeled (x, y) from x to y. The start
and only accepting state is v. It is readily verified that A accepts L(G, v), is
bideterministic, and that A is the minimal trim DFA for this language. By
construction, r(A) = r(G), and r(A) = h(L(G, v)) by Theorem 18. This completes
the NP-completeness proof for unbounded alphabet size.
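The automaton used in this reduction is simple enough to write down explicitly. The Python sketch below builds it from an edge list and simulates it on a word; the helper names `walk_automaton` and `accepts`, and the encoding of a word as a list of edges, are illustrative assumptions rather than notation from the proof.

```python
def walk_automaton(edges, v):
    """DFA for L(G, v): states are the vertices of G, each edge (x, y)
    is a letter of its own labeling the single transition x -> y, and v
    is both the start state and the only accepting state."""
    transitions = {(x, (x, y)): y for (x, y) in edges}
    return transitions, v, {v}

def accepts(automaton, word):
    """Simulate the (partial) DFA on a word given as a list of edges."""
    transitions, state, finals = automaton
    for sym in word:
        if (state, sym) not in transitions:
            return False  # undefined transition: the word is not a walk
        state = transitions[(state, sym)]
    return state in finals
```

Since every letter labels exactly one transition, no state can have two incoming (or two outgoing) transitions with the same label, which is the reason the automaton is bideterministic.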

We turn to the case of binary alphabets. Given an instance (A, k) of BIDETERMINISTIC
STAR HEIGHT, we construct in polynomial time a bideterministic
finite automaton B over the alphabet {a, b}, such that the star
height of L(B) equals the star height of L(A). Assume the input alphabet of A
is Σ = {a_1, a_2, ..., a_r}. The automaton B will accept the homomorphic image
of L(A) under the homomorphism ρ : Σ* → {a, b}* given by ρ(a_i) = a^i b^{r+1-i},
for 1 ≤ i ≤ r. It is known [30] that ρ preserves star height, that is, for every
regular language L, the image of L under ρ is of the same star height as L. It
remains to construct a bideterministic automaton B accepting ρ(L(A)) in polynomial
time: automaton B will have the states of A, plus some extra states.
For each state q copied from A, we add r states q_1^+, ..., q_r^+ and r more states
q_1^-, q_2^-, ..., q_r^- to the state set of B. The transition relation of B is given by
requiring that whenever there is a transition p -a_i-> q in A, then B admits the
sequence of transitions

p -a-> p_1^+ -a-> p_2^+ -a-> ... -a-> p_i^+ -b-> q_{r-i}^- -b-> ... -b-> q_2^- -b-> q_1^- -b-> q.

There are no other transitions in B. By construction, B accepts ρ(L(A)). It is
easily verified that if A is bideterministic, then so is B. □
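To make the binary encoding concrete, the following Python sketch applies the homomorphism to a word. It assumes the encoding ρ(a_i) = a^i b^{r+1-i}, so that every letter is mapped to a block of length r + 1; the function name `rho` and the representation of a word as a list of letter indices are hypothetical choices for this example.

```python
def rho(word, r):
    """Encode a word over {a_1, ..., a_r} in binary.

    word: list of letter indices i with 1 <= i <= r.
    Assumed encoding: a_i is mapped to 'a' * i followed by 'b' * (r + 1 - i).
    """
    blocks = []
    for i in word:
        if not 1 <= i <= r:
            raise ValueError("letter index out of range")
        blocks.append("a" * i + "b" * (r + 1 - i))
    return "".join(blocks)
```

For example, with r = 3 the word a_1 a_3 is mapped to abbb followed by aaab. All code words have the same length, so the image can be decoded by cutting it into blocks of length r + 1.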

Returning again to CYCLE RANK, we observe that the digraph under- 
lying a bideterministic automaton over a binary alphabet always has maximum 
outdegree at most 2 and maximum total degree at most 4. The correspon- 
dence given by McNaughton's Theorem between bideterministic automata and 
digraphs yields the following consequence: 

Corollary 22. The CYCLE RANK problem restricted to digraphs of maximum
outdegree at most 2 and total degree at most 4 remains NP-complete. □

6 Conclusion 

In this work, we explored measures for the complexity of digraphs, and their 
applications. We paid particular attention to the cycle rank of digraphs and its 
relation to other digraph complexity measures, as well as its connection to the
star height of regular languages. A tabular summary of our main algorithmic 
results is given in the Appendix. 

Regarding cycle rank, the undirected case seems to be much better under- 
stood than the general case. An intriguing open question is whether the cycle 
rank problem is fixed-parameter tractable. This is known to be the case on 
undirected graphs, see [8].

Regarding the star height problem, the picture is even less clear. The main
problem, namely the decidability status, has been settled for more than 20 years
now. Still, the computational complexity of this problem is not well understood.
From the viewpoint of computational complexity, we studied the "easiest hard
case", and showed that (the non-succinct version of) this problem is NP-hard.
Currently the best upper bound is EXPSPACE. Narrowing the wide
gap between these bounds is surely a challenging theme for further research.

Acknowledgment 

The author would like to thank Markus Holzer for carefully reading an earlier 
draft of this paper, and for providing some valuable suggestions. 

References 

[1] Amit Agarwal, Moses Charikar, Konstantin Makarychev, and Yury 
Makarychev. O(\sqrt{\log n}) approximation algorithms for min UnCut, min
2CNF deletion, and directed cut problems. In Harold N. Gabow and Ronald 
Fagin, editors, 37th Annual ACM Symposium on Theory of Computing, 
pages 573-581, 2005. 

[2] Hannah Alpert. Rank numbers of grid graphs. Discrete Mathematics, 
310(23):3324-3333, 2010. 

[3] Dana Angluin. Inference of reversible languages. Journal of the ACM, 
29(3):741-765, 1982. 

[4] Amotz Bar-Noy, Panagiotis Cheilaris, Michael Lampis, Valia Mitsou, and 
Stathis Zachos. Ordered coloring grids and related graphs. Theoretical 
Computer Science, 2011. Accepted for publication. 

[5] János Barát. Directed path-width and monotonicity in digraph searching.
Graphs and Combinatorics, 22(2):161-172, 2006.

[6] Dietmar Berwanger, Anuj Dawar, Paul W. Hunter, Stephan Kreutzer, and
Jan Obdržálek. The DAG-width of directed graphs. Journal of Combinatorial
Theory, Series B, 2011. Accepted for publication.

[7] Andreas Björklund, Thore Husfeldt, Petteri Kaski, and Mikko Koivisto.
The travelling salesman problem in bounded degree graphs. In Luca Aceto,
Ivan Damgård, Leslie A. Goldberg, Magnús M. Halldórsson, Anna Ingólfsdóttir,
and Igor Walukiewicz, editors, 35th International Colloquium on
Automata, Languages and Programming (Part I), volume 5125 of Lecture
Notes in Computer Science, pages 198-209. Springer, 2008.

[8] Hans L. Bodlaender, Jitender S. Deogun, Klaus Jansen, Ton Kloks, Dieter
Kratsch, Haiko Müller, and Zsolt Tuza. Rankings of graphs. SIAM Journal
on Discrete Mathematics, 11(1):168-181, 1998.

[9] Hans L. Bodlaender, John R. Gilbert, Hjálmtýr Hafsteinsson, and Ton
Kloks. Approximating treewidth, pathwidth, frontsize, and shortest elimination
tree. Journal of Algorithms, 18(2):238-255, 1995.

[10] Jianer Chen, Yang Liu, Songjian Lu, Barry O'Sullivan, and Igor Razgon.
A fixed-parameter algorithm for the directed feedback vertex set problem.
Journal of the ACM, 55(5):Article No. 21, 2008.

[11] Reinhard Diestel. Graph Theory, volume 173 of Graduate Texts in Mathe- 
matics. Springer, 3rd edition, 2006. 

[12] Lawrence C. Eggan. Transition graphs and the star height of regular events. 
Michigan Mathematical Journal, 10(4):385-397, 1963. 

[13] Stanley C. Eisenstat and Joseph W. H. Liu. The theory of elimination trees 
for sparse unsymmetric matrices. SIAM Journal on Matrix Analysis and 
Applications, 26(3):686-705, 2005. 

[14] Robert Ganian, Petr Hliněný, Joachim Kneis, Alexander Langer, Jan Obdržálek,
and Peter Rossmanith. On digraph width measures in parameterized
algorithmics. In Jianer Chen and Fedor V. Fomin, editors, 4th
International Workshop on Parameterized and Exact Computation, volume
5917 of Lecture Notes in Computer Science, pages 185-197. Springer, 2009.

[15] Michael R. Garey and David S. Johnson. Computers and Intractability:
A Guide to the Theory of NP-Completeness. A Series of Books in the
Mathematical Sciences. W. H. Freeman, 1979.

[16] Hermann Gruber. Digraph complexity measures and applications in formal
language theory. In David Antoš, Milan Češka, Zdeněk Kotásek, Mojmír
Křetínský, Luděk Matyska, and Tomáš Vojnar, editors, 4th Workshop
on Mathematical and Engineering Methods in Computer Science, Znojmo,
Czech Republic, pages 60-67, 2008.

[17] Hermann Gruber. On balanced separators, treewidth, and cycle rank. 
Preprint, 2010. Available online as arXiv:1012.1344v1 [cs.DM].

[18] Hermann Gruber and Markus Holzer. Finite automata, digraph connectivity,
and regular expression size. In Luca Aceto, Ivan Damgård,
Leslie A. Goldberg, Magnús M. Halldórsson, Anna Ingólfsdóttir, and Igor
Walukiewicz, editors, 35th International Colloquium on Automata, Languages
and Programming (Part II), volume 5126 of Lecture Notes in Computer
Science, pages 39-50. Springer, 2008.

[19] Kosaburo Hashiguchi. Algorithms for determining relative star height and 
star height. Information and Computation, 78(2):124-169, 1988. 

[20] Markus Holzer and Martin Kutrib. Nondeterministic descriptional complexity
of regular languages. International Journal of Foundations of Computer
Science, 14(6):1087-1102, 2003.

[21] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory,
Languages and Computation. Addison-Wesley Series in Computer Science.
Addison-Wesley, 1979.

[22] Paul W. Hunter. LIFO-search on digraphs: A searching game for cycle- 
rank. In Olaf Owe, Martin Steffen, and Jan A. Telle, editors, 18th Interna- 
tional Symposium on Fundamentals of Computation Theory, volume 6914 
of Lecture Notes in Computer Science. Springer, 2011. 

[23] Harry B. Hunt III and Daniel J. Rosenkrantz. Computational parallels between
the regular and context-free languages. SIAM Journal on Computing,
7(1):99-114, 1978.

[24] Shiva Kintali, Nishad Kothari, and Akash Kumar. Approximation algorithms
for directed width parameters. Preprint, 2011. Available online as
arXiv:1107.4824v1 [cs.DS].

[25] Daniel Kirsten. On the complexity of the relative inclusion star height 
problem. Advances in Computer Science and Engineering, 5(3):173-211, 
2010. 

[26] Jon Kleinberg and Éva Tardos. Algorithm Design. The Morgan Kaufmann
Series in Computer Architecture and Design. Addison-Wesley Longman
Publishing Co., Inc., 2005.

[27] Stephan Kreutzer and Sebastian Ordyniak. Digraph decompositions
and monotonicity in digraph searching. Theoretical Computer Science,
412(35):4688-4703, 2011.

[28] Michael Lampis, Georgia Kaouri, and Valia Mitsou. On the algorithmic
effectiveness of digraph decompositions and complexity measures. Discrete
Optimization, 8(1):129-138, 2011.

[29] Robert McNaughton. The loop complexity of pure-group events. Information
and Control, 11(1/2):167-176, 1967.

[30] Robert McNaughton. The loop complexity of regular events. Information
Sciences, 1(3):305-328, 1969.



[31] Jaroslav Nešetřil and Patrice Ossona de Mendez. Tree-depth, subgraph
coloring and homomorphism bounds. European Journal of Combinatorics,
27(6):1022-1041, 2006.

[32] Igor Razgon. Computing minimum directed feedback vertex set in
O*(1.9977^n). In Giuseppe F. Italiano, Eugenio Moggi, and Luigi Laura,
editors, Proceedings of the 10th Italian Conference on Theoretical Computer
Science, pages 70-81. World Scientific, 2007.

[33] Neil Robertson and Paul D. Seymour. Graph minors. II. Algorithmic aspects
of tree-width. Journal of Algorithms, 7(3):309-322, 1986.

[34] Mohammad Ali Safari. D-width: A more natural measure for directed 
tree width. In Joanna Jedrzejowicz and Andrzej Szepietowski, editors, 
30th International Symposium on Mathematical Foundations of Computer 
Science, volume 3618 of Lecture Notes in Computer Science, pages 745-756. 
Springer, 2005. 

[35] Robert Schreiber. A new implementation of sparse Gaussian elimination. 
ACM Transactions on Mathematical Software, 8(3):256-276, 1982. 

[36] Benno Schwikowski and Ewald Speckenmeyer. On enumerating all mini- 
mal solutions of feedback problems. Discrete Applied Mathematics, 117(1- 
3):253-265, 2002. 



A Appendix 



CYCLE RANK

Instance.   A digraph G and an integer k.
Question.   Is the cycle rank of G at most k?
Good news.  Approximable within O((log n)^{3/2}) in polynomial time
            (Thm. 11). Exact solution can be computed in
            time O*(1.9129^n) for digraphs with maximum outdegree
            at most 2; and for unbounded outdegree in time
            O*(2^n) (Thm. 14).
Bad news.   NP-complete (Thm. 10). Problem is NP-hard already
            for digraphs of maximum outdegree 2 and maximum
            total degree 4 (Cor. 22); NP-hard also for some classes
            of undirected graphs (e.g., bipartite and cobipartite) [8].

DIRECTED FEEDBACK VERTEX SET

Instance.   A digraph G and an integer k.
Question.   Does G admit a directed feedback vertex set of cardinality
            at most k?
Good news.  For digraphs with maximum outdegree at most 2, exact
            solution can be computed in time O*(1.9129^n)
            (Thm. 15); and in time O*(1.9977^n) for unbounded
            outdegree [32]. Problem is fixed-parameter tractable [10].
Bad news.   NP-complete, already for digraphs of maximum outdegree
            2 [15, Problem GT7].

BIDETERMINISTIC STAR HEIGHT

Instance.   A bideterministic finite automaton A and an integer k.
Question.   Is the star height of L(A) at most k?
Good news.  Approximable within O((log n)^{3/2}) in polynomial time
            (Thm. 19). Exact solution can be computed in
            time O*(1.9129^n) for binary alphabets; and for
            unbounded alphabet size in time O*(2^n) (Thm. 20).
Bad news.   NP-complete; NP-hardness holds already for binary
            alphabets (Thm. 21).



STAR HEIGHT

Instance.   A deterministic finite automaton A and an integer k.
Question.   Is the star height of L(A) at most k?
Good news.  Problem is decidable [19]. Exact solution can be computed
            within exponential space and doubly exponential
            time [25].
Bad news.   NP-hard, already for binary alphabets (Thm. 21).
            Problem is PSPACE-hard if the input is given by a
            nondeterministic finite automaton in place of a deterministic
            one [23].